Learning from the Ones that Got Away: Detecting New Forms of Phishing Attacks
Abstract
Phishing attacks continue to pose a major headache for defenders of computing systems, often forming the first step in a multistage attack. Great strides have been made in phishing detection; however, some insidious kinds of phishing messages still pass through filters by making seemingly simple structural and semantic changes to the messages. We tackle this problem in this paper through the use of a machine learning classifier operating on a large corpus of phishing and legitimate messages. By understanding common phishing features, we design a system that extracts features and elevates some of them to higher-level features meant to defeat common phishing mail construction strategies. The algorithms are instantiated in a usable system called SAFE-PC (Semi-Automated Feature generation for Phish Classification). To evaluate SAFE-PC, we collect a large corpus of phishing messages from the central IT organization at a tier-1 research university. Running SAFE-PC on this dataset exposes some hitherto unknown insights about phishing campaigns directed at university users. SAFE-PC detects more than 70% of the emails that had eluded our production deployment of Sophos, a state-of-the-art email filtering tool. It also performs better than SpamAssassin, a commonly used email filter. We also develop an online version of SAFE-PC that can be incrementally retrained with new samples. Its detection performance improves over time as new samples are fed in, while the time to retrain it stays constant.
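The idea of incremental retraining with a constant per-update cost can be illustrated with a minimal sketch. This is not the SAFE-PC algorithm, whose features and classifier are not described in the abstract; it swaps in a plain multinomial naive Bayes over whitespace-split word counts, and the class name `OnlineNaiveBayes`, its methods, and the sample messages are all hypothetical. The point it shows is only that each update touches the new batch and never revisits earlier samples.

```python
import math
from collections import defaultdict


class OnlineNaiveBayes:
    """Hypothetical illustration: naive Bayes whose counts are updated batch by batch."""

    def __init__(self):
        self.doc_counts = defaultdict(int)    # label -> number of messages seen
        self.total_tokens = defaultdict(int)  # label -> number of tokens seen
        self.token_counts = defaultdict(lambda: defaultdict(int))  # label -> token -> count
        self.vocab = set()

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase whitespace tokens stand in for real email features.
        return text.lower().split()

    def partial_fit(self, messages, labels):
        # Work done here depends only on the size of the new batch,
        # so the retraining time does not grow with the history.
        for text, label in zip(messages, labels):
            self.doc_counts[label] += 1
            for tok in self.tokenize(text):
                self.token_counts[label][tok] += 1
                self.total_tokens[label] += 1
                self.vocab.add(tok)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)
            # Add-one smoothing; the extra +1 reserves mass for unseen tokens.
            denom = self.total_tokens[label] + len(self.vocab) + 1
            for tok in self.tokenize(text):
                score += math.log((self.token_counts[label].get(tok, 0) + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = OnlineNaiveBayes()
# First batch of labeled samples (made-up text).
clf.partial_fit(
    ["verify your account password now", "meeting agenda for monday attached"],
    ["phish", "legit"],
)
# A later batch is folded in without reprocessing the first one.
clf.partial_fit(
    ["your mailbox quota exceeded click here to verify",
     "lecture notes for the seminar attached"],
    ["phish", "legit"],
)
print(clf.predict("please verify your mailbox password"))
```

A count-based model was chosen for the sketch because its update is a simple increment; any classifier with an incremental update rule would illustrate the same property.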
Similar resources
Detecting Fake Websites Using Swarm Intelligence Mechanism in Human Learning
The internet and its various services have made it easy for users to communicate with each other. Internet benefits include online business and e-commerce. E-commerce has boosted online sales and online auction types. Despite their many uses and benefits, the internet and its services face various challenges, such as information theft, which challenges the use of these services. Information thef...
A Novel Architecture for Detecting Phishing Webpages using Cost-based Feature Selection
Phishing is one of the luring techniques used to exploit personal information. A phishing webpage detection system (PWDS) extracts features to determine whether it is a phishing webpage or not. Selecting appropriate features improves the performance of PWDS. Performance criteria are detection accuracy and system response time. The major time consumed by PWDS arises from feature extraction that ...
Phishing Website Detection based on Supervised Machine Learning with Wrapper Features Selection
The problem of Web phishing attacks has grown considerably in recent years, and phishing is considered one of the most dangerous Web crimes, which may cause tremendous negative effects on online business. In a Web phishing attack, the phisher creates a forged or phishing website to deceive Web users in order to obtain their sensitive financial and personal information. Several conventiona...
Tracking Phishing Attacks Over Time
So-called “phishing” attacks are one of the important threats to individuals and corporations in today’s Internet. Combatting phishing is thus a top priority, and has been the focus of much work, both on the academic and on the industry sides. In this paper, we look at this problem from a new angle. We have monitored a total of 19,066 phishing attacks over a period of ten months and found t...
Trustworthiness testing of phishing websites: A behavior model-based approach
Phishing attacks lure website users into visiting fake web pages and providing their personal information. However, testing of phishing websites is challenging. Unlike traditional web-based program testing, we do not know the response of form submissions in advance. There is a lack of effort to help anti-phishing professionals who manually verify a reported phishing site and take further actions....